Here I show three diagnostics: Convergence and efficiency, Traceplots, and Pairs plots.
From the Stan documentation on convergence and efficiency diagnostics for Markov Chains:
The Rhat function produces R-hat convergence diagnostic, which compares the between- and within-chain estimates for model parameters and other univariate quantities of interest. If chains have not mixed well (ie, the between- and within-chain estimates don’t agree), R-hat is larger than 1. We recommend running at least four chains by default and only using the sample if R-hat is less than 1.05. Stan reports R-hat which is the maximum of rank normalized split-R-hat and rank normalized folded-split-R-hat, which works for thick tailed distributions and is sensitive also to differences in scale.
The ess_bulk function produces an estimated Bulk Effective Sample Size (bulk-ESS) using rank normalized draws. Bulk-ESS is useful measure for sampling efficiency in the bulk of the distribution (related e.g. to efficiency of mean and median estimates), and is well defined even if the chains do not have finite mean or variance.
The ess_tail function produces an estimated Tail Effective Sample Size (tail-ESS) by computing the minimum of effective sample sizes for 5% and 95% quantiles. Tail-ESS is useful measure for sampling efficiency in the tails of the distribution (related e.g. to efficiency of variance and tail quantile estimates).
Both bulk-ESS and tail-ESS should be at least 100 (approximately) per Markov Chain in order to be reliable and indicate that estimates of respective posterior quantiles are reliable.
In the plots below, I will show these three diagnostics for the models run for each of our six populations.
Given that each model is estimating hundreds of parameters, traceplots for all parameters will not be shown here. Instead, traceplots for a few parameters from each model will be shown here.
Pairs plots show univariate histograms and bivariate scatter plots for selected parameters, and are especially useful for identifying collinearity between variables (which manifests as narrow bivariate plots) as well as the presence of multiplicative non-identifiabilities (banana-like shapes).
For these pairs plots, I am showing all of the different parameters that exist for a single movement (e.g., the intercept, origin, and temperature parameters that govern the probability of a single movement, such as ascending a specific dam). Again, there are too many movements across the six models to show all of these plots, but I will show examples of potential issues related to collinearity in our predictors. A description of collinearity of predictors in regressions can be found in the Stan User Guide.
After making the change such that each fish receives only either a DPS-specific or an origin-specific intercept term (which fixed nonsensical collinearity issues between origin and intercept parameters), there are still two common cases of collinearity that are important to be aware of for interpreting parameter estimates:
For ascents of dams, particularly those that are downstream of natal tributaries, the intercept parameter (b0) and the temperature parameter (btemp1) that estimates the effect of river temperature during summer/fall (Jun 1 - Dec 31) are highly negatively correlated. Shown above are two pairs plots for the parameters that govern the probability of ascending McNary Dam for Upper Columbia Wild Steelhead and Snake River Hatchery Steelhead. As seen in the plots, b0 and btemp1 are strongly negatively correlated, while b0 and btemp0 (the effect of river temperature during winter/spring, Jan 1 - May 1) are also negatively correlated.
My understanding of why this is happening is that this upstream movement is the dominant movement out of this state (as reaching natal tributaries for all fish from these DPSs requires ascending McNary Dam), and this upstream movement is also happening when temperatures are highest (in alignment with migration in the late summer/early fall). The negative collinearity arises from the pattern that when the intercept (b0) draw is low, the temperature (btemp1) draw must be high to explain the high probability of that movement, and vice versa.
Interestingly, the parameter estimates for the effect of temperature on ascending McNary Dam (a necessary movement for these fish to reach their natal tributaries) indicate that warmer temperatures actually decrease the probability of making this ascent, therefore making other movements (such as movements into tribuatries like the Deschutes, or into the loss state) more likely in warmer conditions.
Of course, temperature and day of year are strongly linked, as seen by these plots showing counts of ascents of McNary Dam by day of year and temperature, for Upper Columbia and Snake River Steelhead:
For fallback movements, especially for fallback over BON, the intercept (b0_matrix_2_1) and spill window (bspillwindow_matrix_2_1) parameters are negatively correlated, with a similar pattern as that seen for the relationship between the btemp and b0 parameters shown above.
The same pattern is observed for spill as is observed for temperature: fallback events occur primarily at high spill levels, but also with distinct seasonality (these en-route fallback events are happening primarily when fish have just ascended Bonneville Dam in late summer). The negative collinearity arises from the pattern that when the intercept (b0) draw is low, the spill (bspillwindow) draw must be high to explain the observed movements, and vice versa.
The correlation between temperature + intercept and spill + intercept don’t really concern me too much - the posteriors seem to be properly behaved. They just need to be interpreted as representing seasonality as much as they are representing the covariates themselves.
Big picture: The correlation between parameters emphasizes that parameter estimates cannot be interpreted in isolation. Instead, posterior predictive estimates of movements are necessary to interpret the meaning of parameters.